ABSTRACT

The 30 papers in the area of educational statistics 
that were presented at the 1972 AERA Conference are reviewed. The 
papers are categorized into five broad areas of interest: (1) theory 
of univariate analysis, (2) nonparametric methods, (3)
regression-prediction theory, (4) multivariable methods, and (5) 
factor analysis. A list of the papers reviewed, their authors, and, 
when applicable, the ED numbers concludes the summary. (DB)

INTRODUCTION

About 700 of the 1,000 papers presented at the 1972 AERA Annual Meeting in Chicago, 
Illinois were collected by the ERIC Clearinghouse on Tests, Measurement, and Evaluation 
(ERIC/TM). ERIC/TM indexed and abstracted for announcement in Research in 
Education (RIE) 200 papers which fell within our area of interest - testing, measurement, 
and evaluation. The remaining papers were distributed to the other Clearinghouses in the 
ERIC system for processing. 

Because of an interest in thematic summaries of AERA papers on the part of a large 
segment of ERIC/TM users, we decided to invite a group of authors to assist us in
producing such a series based on the materials processed for RIE. Four topics were
chosen for the series: Criterion Referenced Measurement, Evaluation, Statistics, and
Test Construction. 

Most papers referred to in this summary may be obtained in either hard copy or
microfiche form from: 

ERIC Document Reproduction Service (EDRS) 
P.O. Drawer O
Bethesda, Maryland 20014 

Prices and ordering information for these documents may be found in any current
issue of Research in Education.



EDUCATIONAL STATISTICS 



Douglas A. Penfield 



Once again the AERA annual meeting has produced an 
abundance of research papers in the area of educational 
statistics with content covering a wide range of theory 
and application. For purposes of discussion, the educa- 
tional statistics papers which were submitted to ERIC are 
broken into five broad areas of interest: (a) theory of
univariate analysis, (b) nonparametric methods, (c)
regression-prediction theory, (d) multivariable methods, and
(e) factor analysis. As was true at the 1971 AERA
convention, research on factor analytic methods repre- 
sented the most frequently discussed topic. 



THEORY OF UNIVARIATE ANALYSIS 



Papers in this section range from testing assumptions
under a univariate model to a discussion of a randomized
block design for dichotomous variables. Not too technical
in nature, they can easily be read and understood by
researchers having a minimum of mathematical training. 

Ramseyer and Tcheng discuss the robustness of the 
studentized range statistic, q, with respect to violation of
the assumptions of normality and homogeneity of variance.
Using an IBM 360/50 computer, values of q
were generated for groups (k) of 3 and 5, each containing
5 and 15 scores respectively. When k = 3, the homo- 
geneity of variance assumption was violated by allowing 
one of the sample variances to become two and then four 
times as large as the variance in the other two groups. For 
k = 5, the variance in two of the samples was allowed to 
become two and four times larger than the variance in the 
three remaining samples. The normality assumption was 
altered by transforming scores into distributions which 
were positively and negatively skewed, exponential, and 
rectangular. Comparisons are then made between Type I
error rates at the .05 and .01 levels.

The results show that when the homogeneity of 
variance assumption is violated, the observed Type I error
rates are slightly higher than the established rate. Under 
violation of the normality assumption, the observed error 
rates are generally below the fixed level. For simultaneous 
violation of normality and equal variance, the error rates 
are larger than the nominal levels selected. Nevertheless, 
the overall variation between observed and expected error 
rates is minimal and closely resembles results obtained 
when the assumptions are satisfied. 
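
A robustness check of this kind can be sketched in a few lines. The sketch below is not the authors' program: it assumes normal scores, k = 3 groups of 5, one group with four times the variance of the others, and it uses an empirical .05 critical value estimated from the simulated null distribution rather than a tabled value of q. The seed and the number of replications are arbitrary.

```python
import random
from statistics import mean, variance

def studentized_range(groups):
    """q = (largest mean - smallest mean) / sqrt(MS_within / n), equal n."""
    n = len(groups[0])
    means = [mean(g) for g in groups]
    ms_within = mean([variance(g) for g in groups])   # pooled variance, equal n
    return (max(means) - min(means)) / (ms_within / n) ** 0.5

def simulate_q(sds, n, replications, rng):
    """Draw `replications` values of q for normal groups with the given SDs."""
    return [
        studentized_range([[rng.gauss(0, sd) for _ in range(n)] for sd in sds])
        for _ in range(replications)
    ]

rng = random.Random(360)                      # arbitrary seed (assumption)
null_q = sorted(simulate_q([1, 1, 1], n=5, replications=2000, rng=rng))
critical = null_q[(95 * len(null_q)) // 100]  # empirical .05 critical value

# Violate homogeneity: one group has SD 2, i.e. four times the variance.
violated_q = simulate_q([1, 1, 2], n=5, replications=2000, rng=rng)
error_rate = sum(q > critical for q in violated_q) / len(violated_q)
print(f"empirical critical value {critical:.3f}, observed rate {error_rate:.3f}")
```

Comparing `error_rate` with the nominal .05 gives the kind of observed-versus-expected contrast the paper reports.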

The study is particularly interesting because comparisons
on the distribution of q at the .05 and .01 levels are
also made when all the assumptions are met. This adds 
credence to the random sampling procedure which was 
developed for the study. 

A two-group, two-treatment experimental design is
presented by Maxey. The notation and layout are similar
to the format developed by Campbell and Stanley (1963).
The series of events for each group consisted of two 
initial observations at different points in time, followed 
by a first treatment, an observation, a second treatment, 
and a final observation. The advantages and purposes of 
the design are discussed in detail. 

Since the effects are confounded over treatments, the 
author proposes to set up a 2-way repeat measures design 
to estimate needed error variances. Once the error 
variances are determined, he uses them to evaluate
planned comparisons of interest developed around various
combinations of cell means. 

An example is presented to illustrate a practical use for 
the design, but it has a number of undesirable features, 
including a small sample size (N = 21), and an inappropriate
use of the error terms for setting up a number of
planned comparisons. Thus, it is impossible to place much
faith in the F values so computed. The method of
analysis proposed for this extremely complex design
cannot be justified mathematically. 

Byars and Roscoe describe a procedure for trans- 
forming uniformly distributed data into data having an 
approximate normal distribution. The authors point out 
that this procedure would find its greatest value in Monte 
Carlo type studies where uniformly distributed pseudo- 
random numbers are generated by a computer and must 
subsequently be normally transformed. 

Using the standard normal cumulative distribution 
function, P (z), an algebraic approximation to the inverse 
Gaussian is derived. The authors consider this procedure
to be more accurate and computationally efficient than 
the algebraic approximations developed by Hastings and 
Burr. The Byars-Roscoe approximation involves the use of 
rational polynomial expressions under a linear data
transformation. When P (z) ranged between .01 and .99, it was
found to have greater accuracy and require less time to
compute than any of the other previously discussed
procedures.
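
The Byars-Roscoe coefficients are not given in this summary, so the sketch below illustrates the general idea with the older Hastings rational approximation that the authors use as a point of comparison (the constants are those tabulated as formula 26.2.23 in Abramowitz and Stegun). The function names are illustrative only, and the library routine `NormalDist().inv_cdf` serves as the reference value.

```python
import math
from statistics import NormalDist

# Hastings' constants (Abramowitz and Stegun, formula 26.2.23).
C = (2.515517, 0.802853, 0.010328)
D = (1.432788, 0.189269, 0.001308)

def hastings_inverse_normal(p):
    """Approximate z such that P(Z <= z) = p for a standard normal Z."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    tail = p if p < 0.5 else 1.0 - p          # work in the smaller tail
    t = math.sqrt(-2.0 * math.log(tail))
    numerator = C[0] + C[1] * t + C[2] * t * t
    denominator = 1.0 + D[0] * t + D[1] * t * t + D[2] * t ** 3
    z_upper = t - numerator / denominator     # deviate cutting off `tail` above
    return -z_upper if p < 0.5 else z_upper

def max_abs_error(probabilities):
    """Largest discrepancy from the library inverse over the given points."""
    exact = NormalDist()
    return max(abs(hastings_inverse_normal(p) - exact.inv_cdf(p))
               for p in probabilities)

grid = [i / 100 for i in range(1, 100)]       # P(z) from .01 to .99
print(max_abs_error(grid))
```

Feeding uniform pseudo-random numbers through `hastings_inverse_normal` yields approximately normal deviates, which is the Monte Carlo use the authors have in mind.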

The implications of pooling the interaction term with
the error term in a 2-way factorial design when testing for
main effect differences are discussed by Pohlmann. After
reviewing the pros and cons of pooling, Pohlmann 
describes a Monte Carlo study to illustrate the conse- 
quences of this procedure on tests of main effects. The 
variation of parameters described in the Monte Carlo
study is somewhat difficult to follow, but the effects of
the pooled error term on main effect differences are
observed over changes in a non-centrality parameter,
sample size and alpha level. 

The results indicate that when the ratio of the 
interaction degrees of freedom to the error degrees of
freedom is less than 0.08 and the interaction term is not
significant at the .25 level, it may be useful to pool the
interaction term with the conventional error component. 
In so doing, one creates a test of main effect which is 
considered to be slightly more powerful than the test 
which uses the standard error term. Due to a lack of
breadth in the development of the Monte Carlo study, the
results are not generalizable beyond a 2-way factorial
design.
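
The pooling rule reported above can be written out as a small helper. This is only a sketch of the rule of thumb as summarized here, with hypothetical function and parameter names; the pooled error mean square is formed in the usual way by adding sums of squares and degrees of freedom.

```python
def should_pool(df_interaction, df_error, p_interaction,
                max_df_ratio=0.08, min_p=0.25):
    """Pool only when the df ratio is below .08 and the interaction
    test is not significant at the .25 level."""
    return (df_interaction / df_error) < max_df_ratio and p_interaction > min_p

def pooled_error_mean_square(ss_interaction, df_interaction, ss_error, df_error):
    """Pooled error term: add sums of squares and degrees of freedom."""
    return (ss_interaction + ss_error) / (df_interaction + df_error)

# Hypothetical 2-way design: 2 interaction df, 54 error df, interaction p = .40.
if should_pool(2, 54, 0.40):
    print(pooled_error_mean_square(6.0, 2, 108.0, 54))
```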

Draper investigated the problem of employing analysis 
of variance procedures to analyze dichotomous repeated
measures data. Two situations are presented: one in which
dichotomous responses are gathered on the same items at
four separate occasions and a second where the responses 
occur on different items over the four occasions. The 
situations can be represented as a 3-way factorial design 
with items isolated as a contributing source of variation. 

All sources of variation, as well as possible confounding
effects and appropriate F-tests, are discussed in detail.
Using simulated data, a Monte Carlo study was set up
with variations made in the base probability of a one,
number of subjects and the degree of heterogeneity of the
subjects. Comparisons were then made between normally 
distributed data and dichotomous data. The results indi- 
cate that the power of the analysis of variance test based 
upon dichotomous data is less than one half of the power 
of the same test performed on normal data. Draper 
suggests that if the dependent variable is dichotomous, 
one should choose a large sample to insure reasonable
power, for he obtained the greatest power when the
probability of a one was near 0.5 and there were 6 or
more subjects in the experiment.

In a study similar to the Draper one, Mandeville makes
a comparison among three methods of analyzing dichotomous
data under a randomized block design. Studies
summarizing Cochran's Q test and comparing it with the
F test are noted. Mandeville, using dichotomous data,
then makes comparisons between F, Q, and a multivariate
test statistic, M, developed around Hotelling's T².

Dichotomous data were simulated from a multivariate 
normal distribution and comparisons were made between 
the empirical and theoretical distributions of F, Q and M. 
In general, for varying numbers of treatments and blocks, 
the F statistic has a smaller average error than either Q or
M, with M being the least desirable of the three. With
respect to power, F is consistently superior to Q. The M
statistic is not used to make power comparisons. On the
basis of the findings, the F test is recommended over Q
or M when: (1) the total sample size is greater than 60,
(2) the interrelationship between variables is constant, and
(3) the data are believed to come from an underlying
normal distribution.
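
For readers unfamiliar with Cochran's Q, the statistic can be computed directly from the block and treatment totals of a 0/1 table. The sketch below uses the standard textbook formula and a small made-up table; it is not drawn from Mandeville's simulations.

```python
def cochran_q(table):
    """Cochran's Q for a blocks-by-treatments table of 0/1 scores."""
    k = len(table[0])                                   # number of treatments
    column_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    grand_total = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in column_totals)
                           - grand_total ** 2)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every block is all 0s or all 1s")
    return numerator / denominator

# Made-up data: 6 blocks (rows) scored pass/fail under 3 treatments (columns).
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
]
print(cochran_q(scores))
```

Under the null hypothesis Q is referred to a chi-square distribution with k - 1 degrees of freedom, which is the large-sample approximation whose accuracy studies of this type examine.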

Draper's and Mandeville's studies are both well con- 
ceptualized and executed, although the one by Draper is 
slightly more global in nature. 



NONPARAMETRIC METHODS 



Recently, research into nonparametric statistical methods
has received a minimum of attention from behavioral
scientists. One explanation may be the large number of
studies investigating and confirming the robustness of
many of the parametric procedures currently in use. Of
the papers received for review by ERIC, only two could
be placed in the category of nonparametric methods. One
deals with describing and comparing some nonparametric
tests useful for testing equality of variance in the two
sample problem, and the other investigates some Chi-square
and Kolmogorov models for testing goodness of fit
to normal.

Penfield contrasts three nonparametric tests for scale,
focusing on their method of development and usage. He
chose the Siegel-Tukey test, the Mood test and the
Normal Scores test, but paid special attention to the
Normal Scores test because of its excellent power relative
to the parametric F test. Procedures for computing the
test statistic for large and small samples are outlined in 
detail for all three tests of dispersion. Two examples, one 
for small N and the other for large N, are presented. 
Results are then computed and compared for the three 
tests. In the case of the large sample example, the normal 
approximation to the exact test is illustrated, and power 
comparisons are made. Of the tests considered, the power 
of the Normal Scores test is greatest when scores are 
drawn from distributions having sharp tails. For distributions
having heavy tails, the Siegel-Tukey test is
superior to the other nonparametric tests.
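
As an illustration of one of the three tests, the Mood statistic and its large-sample normal approximation can be sketched as follows. The formulas are the standard ones for the no-ties case; the data are made up and the function names are not taken from the paper.

```python
import math

def mood_statistic(x, y):
    """Mood's M: squared deviations of the x ranks from the mid-rank
    of the combined sample (assumes no tied observations)."""
    combined = sorted(list(x) + list(y))
    if len(set(combined)) != len(combined):
        raise ValueError("this sketch assumes no tied observations")
    rank = {value: i + 1 for i, value in enumerate(combined)}
    centre = (len(combined) + 1) / 2
    return sum((rank[v] - centre) ** 2 for v in x)

def mood_normal_approximation(x, y):
    """Return (null mean, null variance, z) for the large-sample test."""
    m, n = len(x), len(y)
    total = m + n
    null_mean = m * (total ** 2 - 1) / 12
    null_variance = m * n * (total + 1) * (total ** 2 - 4) / 180
    z = (mood_statistic(x, y) - null_mean) / math.sqrt(null_variance)
    return null_mean, null_variance, z

wide = [1, 10]          # made-up sample with the larger spread
narrow = [4, 5, 6]
print(mood_statistic(wide, narrow), mood_normal_approximation(wide, narrow))
```

A large positive z indicates that the first sample occupies the extreme ranks, that is, that it is the more dispersed of the two.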

A study of the robustness of the Chi-square and
Kolmogorov statistics under the linear score scale and
equal areas models is reported by Kittleson and Roscoe.
The authors restrict themselves to investigating the goodness
of fit of data relative to a normal distribution. They
present a brief review of the literature on the use of the 
Chi-square test for goodness of fit and the Kolmogorov 
test. 

Using normally distributed and uniformly distributed
random numbers generated by a computer, Chi-square and
Kolmogorov statistics are computed on samples under the
linear score scale and equal areas models for varying
sample sizes and numbers of cells. When comparing
nominal and empirical Type I error rates, the Chi-square
equal areas model proves superior to all other tests. The
Kolmogorov tests are found to be very conservative and
considered inferior to the Chi-square tests. The best 
power is obtained when the number of cells approximates 
20. This study is clear, concise and well-executed. 
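
The equal areas model can be illustrated with a short sketch: cell boundaries are placed so that each cell has the same expected frequency under the fitted normal curve. The sample, the seed, and the choice of 20 cells are assumptions made for the example, and the mean and standard deviation are estimated from the data, which is one common way of setting up the test rather than necessarily the authors' procedure.

```python
import random
from bisect import bisect_right
from statistics import NormalDist, mean, stdev

def equal_area_chi_square(scores, cells):
    """Chi-square fit to a normal curve using cells of equal expected area."""
    fitted = NormalDist(mean(scores), stdev(scores))
    # Interior boundaries cut the fitted curve into `cells` equal areas.
    boundaries = [fitted.inv_cdf(i / cells) for i in range(1, cells)]
    observed = [0] * cells
    for score in scores:
        observed[bisect_right(boundaries, score)] += 1
    expected = len(scores) / cells
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return statistic, observed

rng = random.Random(1972)                         # arbitrary seed (assumption)
sample = [rng.gauss(50, 10) for _ in range(200)]  # made-up normal scores
chi_square, counts = equal_area_chi_square(sample, cells=20)
print(chi_square, counts)
```

With both parameters estimated from the sample, the statistic is conventionally referred to a chi-square distribution with (cells - 3) degrees of freedom.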



REGRESSION-PREDICTION THEORY



Of the papers reviewed in this category, the most 
prevalent topic pertained to the development of linear
programming models. There was also considerable interest
in the use of regression analysis to answer questions
normally investigated by means of analysis of variance
and analysis of covariance. 

Referencing Cohen's work on contrast coding for 
multiple linear regression models, Lewis and Mouw extend 
the work to include orthogonal comparisons. The authors 
have broken their discussion into two parts, first showing 
how analysis of variance models can be written in 
regression form, and then treating analysis of covariance
models in a similar fashion. In the case of analysis of 
variance, discussion is restricted to one-way and two-way 
designs, and orthogonal coefficients for setting up trend 
contrasts under the regression model are introduced. An 
illustration of different arrays of coding coefficients in 
the predictor vectors is presented for various contrasts of 
interest. The authors recommend this procedure over the 
conventional analysis of variance because the use of 
independent predictor vectors accurately reflects the
degrees of freedom for the analysis, and also permits the 
investigation of specific contrasts of interest, in addition 
to the depiction of overall main effect differences. 

In a similar fashion, one-way and two-way models
under the analysis of covariance are described. The 
authors give specific attention to pooling the interaction 
with the error term when the interaction is not found to 
be significant. They note that contrast coding does not 
require this pooling, thereby yielding identical results to 
the traditional analysis of covariance model. 
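
The link between orthogonal contrast coding and the one-way analysis of variance can be shown with a small worked example. The sketch assumes three equal-sized groups and two orthogonal contrasts chosen for illustration; with such coding each predictor vector's regression weight and sum of squares can be computed separately, and the contrast sums of squares add up to the between-groups sum of squares.

```python
def contrast_sums_of_squares(groups, contrasts):
    """Return (regression weight, sum of squares) for each contrast.
    Assumes equal group sizes and mutually orthogonal contrasts."""
    results = []
    for coefficients in contrasts:
        # Predictor vector: every score carries its group's coefficient.
        sum_xy = sum(c * sum(g) for c, g in zip(coefficients, groups))
        sum_xx = sum(c * c * len(g) for c, g in zip(coefficients, groups))
        results.append((sum_xy / sum_xx, sum_xy ** 2 / sum_xx))
    return results

def between_groups_ss(groups):
    """Between-groups sum of squares from a one-way analysis of variance."""
    scores = [s for g in groups for s in g]
    grand_mean = sum(scores) / len(scores)
    return sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

groups = [[1, 3], [4, 6], [8, 8]]          # made-up scores, n = 2 per group
contrasts = [(1, -1, 0), (1, 1, -2)]       # group 1 vs 2; groups 1, 2 vs 3
for weight, ss in contrast_sums_of_squares(groups, contrasts):
    print(weight, ss)
print(between_groups_ss(groups))
```

Each contrast uses one degree of freedom, so the two predictor vectors account for the k - 1 = 2 degrees of freedom of the main effect.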

Greenberg and Mejias investigate a use of linear 
least-square multiple regression analysis with dummy 
variables for isolating the effect of the individual teacher 
on student achievement. The sample under investigation 
consists of 572 students enrolled in a social science course
at Miami Dade Junior College. Independent variables
used for prediction purposes consist of an English
Aptitude and a Social Science score on the Florida State-Wide
Twelfth Grade Test (F.T.G.), grade point average,
class size, cumulative hours earned, and dummy variables
representing instructor input. The dependent variable is
the student's final exam score in the social science course.
Dummy variables are used to determine whether differentiated
instruction accounts for variation in the student's
final exam scores.

Results indicate that the Social Science score on 
F.T.G., grade point average, and cumulative hours earned 
account for the greatest sources of variation in final exam 
scores, explaining 48% of the total variance. A significant 
difference is found between instructors and, on the basis 
of the results, it was possible to rank order instructor 
performance. Class size was not related to final exam 
score. Furthermore, neither salary nor salary-related indi- 
ces were significantly correlated with teacher's contribu- 
tion to student achievement. Limitations involved with 
the design and analysis of the study are thoroughly 
discussed. 

AID-4, automatic interaction detector, is described by 
Koplyay as a procedure for identifying optimal configu- 
rations of predictor variables for criterion prediction 
under a restricted multiple regression model. Instead of
starting with a full multiple regression model, AID-4 starts
with the group as a unit and through a splitting process
maximizes the between sum of squares for variable
categories while minimizing the error sum of squares. The
value of this procedure lies in its ability to maximize the
proportion of explained variance in the criterion variable
without having to identify all the interaction components
that are present under the full model. For regression
analyses built around a large number of predictor variables,
AID-4 identifies the generally small subset of these
variables which proves to be significant. A branching
process developed around the outcomes derived from this
program makes it possible to give a more meaningful
interpretation to the results. As a help to potential users,
an explanation of some of the more useful and informative
features of the AID-4 output is also presented. This
procedure is certainly noteworthy and should be of major
interest to those researchers who prefer to let a machine
aid them in the decision making process.
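
The splitting idea can be conveyed by a single step of the process. The sketch below is not the AID-4 program: it takes one ordered predictor, tries every two-way split, and keeps the one with the largest between-groups sum of squares (equivalently, the smallest error sum of squares). The names and data are hypothetical.

```python
def best_binary_split(predictor, criterion):
    """Return (cut_value, between_ss): cases with predictor <= cut_value
    form one group, the rest the other."""
    pairs = sorted(zip(predictor, criterion))
    n = len(pairs)
    grand_mean = sum(y for _, y in pairs) / n
    best = (None, 0.0)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                      # cannot split between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        between = (len(left) * (sum(left) / len(left) - grand_mean) ** 2
                   + len(right) * (sum(right) / len(right) - grand_mean) ** 2)
        if between > best[1]:
            best = (pairs[i - 1][0], between)
    return best

# Made-up data: the criterion jumps between predictor values 2 and 3.
print(best_binary_split([1, 2, 3, 4], [1, 1, 5, 5]))
```

Applying the same step again within each resulting group, and across several predictors, produces the branching structure described above.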

Schnittjer attempts to develop a linear programming
model which would be useful for prediction and then
tests its accuracy by comparing it with the standard
curvilinear multiple regression equation. Weights were
assigned to the levels of each variable so that the
difference between actual and predicted scores would be
minimal. The objective function was the sum of these
differences and was subject to constraints described as
person, variable and level width. An example is presented
using 66 subjects and is based upon 10 independent
variables and 4 dependent variables, each analyzed
separately. Independent variables were divided into levels
ranging from 5 to 27. Following comparisons between the
linear programming and curvilinear multiple regression
models, subsets of 50 individuals were randomly selected
and their linear programming results compared with
results from the 16 individuals not included in the
sample. The author concludes that the two models give
comparable results. The study is extremely vague and
provides no hint as to the nature of the actual equations
computed. A test of accuracy is not indicated, which
suggests that it was made by the "eyeball" method.

A linear programming model designed to make optimal 
assignment of students to attendance centers is presented 
by Ontjes. An objective function is developed which minimizes
the distance students must be bused in order to
reach their assigned centers. Some constraints on the
system are the capacity of the school building, grade
capacity, and the need to assign everyone within an area
to one school. Model and constraint equations are laid
out in detail. An example illustrating the use of the
model on junior high school students is presented, the
purpose of which is to minimize busing while providing a
good racial balance. The example is interesting because of
its implications relative to current demands being placed
upon school systems. Using results generated by a
computer, a summary shows the average distance travelled
and minority percentage within each school. The study is 
clear, easy to read and has some definite practical 
application. 
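
The structure of such an assignment problem can be seen in a toy instance. The sketch below does not use linear programming: because the example has only three areas and two schools, it simply enumerates every whole-area assignment, discards those that exceed building capacity, and keeps the one with the smallest total student-distance. All figures are invented, and the racial balance constraint is omitted.

```python
from itertools import product

def assign_areas(students, distances, capacities):
    """Exhaustively search whole-area assignments.
    Returns (total student-distance, assignment tuple)."""
    best_cost, best_assignment = None, None
    schools = range(len(capacities))
    for assignment in product(schools, repeat=len(students)):
        load = [0] * len(capacities)
        for area, school in enumerate(assignment):
            load[school] += students[area]
        if any(l > cap for l, cap in zip(load, capacities)):
            continue                          # capacity constraint violated
        cost = sum(students[a] * distances[a][s]
                   for a, s in enumerate(assignment))
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment

students = [30, 20, 25]                  # pupils living in areas A, B, C
distances = [(1, 4), (2, 1), (3, 2)]     # miles from each area to schools 0, 1
capacities = [50, 40]                    # seats available in schools 0, 1
cost, plan = assign_areas(students, distances, capacities)
print(cost, plan)
```

Enumeration is practical only for very small cases; a district-sized problem calls for the linear programming formulation the paper describes.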

Matzke formulates a linear programming model to 
simulate a foundation-type support program. The model is 
then applied to a state support program for the public 
schools in Iowa. Five objective functions were developed 
in order to minimize several derivatives of the state 
mandated local tax rate, to minimize state aid costs of
the foundation's program, and to maximize the foundation's
level of support. The general linear programming
model is stated mathematically, as are the constraints on
the model. These constraints fell into 3 general categories
entitled district, system, and variable interaction. Inputs
to the linear programming model consisted of data
obtained from each school district and the Iowa State
Department of Education. The equations are analyzed by
a computer and the results for each optimization problem
are reported. The data analysis is quite extensive and 
involves solutions for the distribution of funds to a 
foundation-type program. Tables which show the results 
under the optimal solution are also provided. The bibli- 
ography at the end of the paper is excellent. 

Under some linear models the values of the independent
variables are assumed to be fixed rather than
random. Calkins and Jennings investigate the effects of 
violating this assumption for a simple linear regression 
model by observing the number of incorrect decisions 
made when testing slope and intercept differences under 
the assumption that the concomitant variable is random 
instead of fixed. Values of the concomitant variable were 
drawn from both a normal and rectangular distribution 
and results were computed on simulated samples of 
varying size. Sampling distributions for slopes and inter- 
cepts were generated and then compared using critical 
statistics. 

The results are elaborately laid out in tabular form. It 
suffices to say that violation of the fixed variable 
assumption does not produce significant observed and 
expected differences with respect to intercepts and slopes. 
To insure robustness, sample sizes greater than thirteen 
should be chosen. 

Friedman examines the postulate that improved pre- 
diction of multiple criteria can be achieved through the 
use of pattern analysis rather than conventional regression 
models. Pattern analysis in this instance implies a re- 
structuring of the data so as to increase the accuracy of 
prediction. This restructuring is handled primarily through 
the use of factor analysis.

The author hypothesizes that: (1) for a single criterion,
simple linear combinations of predictor variables
will perform as well as a combination of linear and 
nonlinear variables; and (2) for predicting multiple cri- 
teria, nonlinear and linear combinations of predictor 
variables will yield a greater multiple correlation coeffi- 
cient than a simple linear combination of the independent 
variables. 

Data was collected on 700 subjects over 10 scales of 
the Parent-Child Relations Questionnaire and 6 scales of 
the California Achievement Test. Analysis consists of 
finding factor scores, canonical correlations and multiple
correlation coefficients with respect to males, females and
the total group. Results indicate that a simple linear
combination of variables gives the best prediction of a
single criterion. When predicting multiple criteria, the
results are not as clearly defined since nonlinear variables 
appear to add valuable information to the prediction 
process. 



MULTIVARIATE METHODS



This section is the most diversified of the five areas being
covered. It contains papers on such complex topics as
time series analysis, multivariate analysis of variance,
discriminant analysis, and interaction analysis. The texts
range from sophisticated mathematical notation to simple 
descriptions of empirical research. Because of their infor- 
mative nature, a number of these papers would be 
valuable reading in an advanced educational statistics
course. 

Raw gain scores, residual gain scores and adjusted
scores derived from an analysis of covariance are compared
by Williams, Maresh and Peebles for the two sample
problem. These comparisons are empirical in nature and
are based on reading scores obtained by 165 pupils
attending rural North Dakota schools. Using notations
outlined under a full and restricted multiple regression
model, the authors formulate the F statistic for the
analysis of covariance. Besides gain scores and the analysis
of covariance, the authors recommend the use of residual
gain scores for determining differences between the two
groups. Their proposed value lies in the fact that they are
uncorrelated, can be defined precisely, and lend themselves
to determination of higher ordered residual gains.
The three procedures were used to test for group differences
on reading related variables across grades 2 through
6. Aside from finding differences in the outcomes derived
from the three methods of analysis, little significant 
knowledge is gained from the comparisons being made. 

Sachdeva proposes a multivariate analog of Hays' 
omega squared for estimating the strength of relationship 
in a multivariate analysis of variance. The term represents 
the proportion of variation in dependent variable scores 
which is accounted for by the independent variables used 
in the study. Omega squared for the univariate case is 
transformed to the multivariate situation by replacing
sums of squares with the determinant of the corresponding
matrix of sums of squares and sums of cross-products.
Multivariate omega squared is then shown to be
a function of Wilks' lambda test criterion. The author also
shows that it can be written as a function of an F ratio. 
An example is presented which explains how to solve the 
various formulas derived for multivariate omega squared. 
The procedures are clearly outlined and make a valuable 
contribution to multivariate methods.
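
The two ingredients named above can be computed as follows. The sketch gives Hays' univariate omega squared in its usual textbook form and Wilks' lambda as a ratio of determinants for the two-variable case; Sachdeva's multivariate formula itself is not reproduced in this summary and is not attempted here. The numbers are made up.

```python
def omega_squared(ss_between, ss_within, groups, n_total):
    """Hays' univariate estimate of strength of relationship."""
    ms_within = ss_within / (n_total - groups)
    ss_total = ss_between + ss_within
    return (ss_between - (groups - 1) * ms_within) / (ss_total + ms_within)

def determinant_2x2(matrix):
    (a, b), (c, d) = matrix
    return a * d - b * c

def wilks_lambda(within, total):
    """Ratio of the determinants of the within and total matrices of
    sums of squares and cross-products (two dependent variables)."""
    return determinant_2x2(within) / determinant_2x2(total)

# Made-up one-way layout: 3 groups, 6 cases, SS between = 36, SS within = 4.
print(omega_squared(36, 4, groups=3, n_total=6))
# Made-up within and total SSCP matrices for two dependent variables.
print(wilks_lambda([[4, 2], [2, 3]], [[10, 4], [4, 6]]))
```

Small values of lambda correspond to strong group separation, which is why a measure of strength of relationship can be expressed as a function of it.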

The prediction of teacher turnover using time series 
analysis was researched by Costa. Two-year and three-year
moving averages, as well as exponential smoothing using a
0.1 and 0.9 smoothing factor, were the assessment
techniques which he employed. Demographic data was
combined with time-series forecasting methods for prediction
purposes. The demographic variables of sex, age,
marital status, and years of experience were used to 
identify different types of teachers. Moving average and
exponential smoothing results were obtained on each
teacher type, and the percent accuracy in predicting
turnover rate is reported. Generally the accuracy was
above 40 per cent. Not unexpectedly, young married
women with less than 4 years of experience showed the
highest rate of turnover. The author's use of the Chi-square
test of independence to test the equality of
non-independent proportions over the four methods is
inappropriate, and information as to which technique
would give the best prediction was inconclusive.
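
The four forecasting methods are simple enough to state in code. The sketch below uses a made-up series of yearly turnover rates and assumes the smoothed series is started at the first observation, a common convention that the summary does not confirm for Costa's analysis.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` values."""
    return sum(series[-window:]) / window

def exponential_smoothing_forecast(series, alpha):
    """Simple exponential smoothing; the last smoothed value is the forecast."""
    smoothed = series[0]                       # start at the first observation
    for value in series[1:]:
        smoothed = alpha * value + (1 - alpha) * smoothed
    return smoothed

turnover = [10, 12, 14, 16]                    # made-up yearly turnover rates
print(moving_average_forecast(turnover, 2))
print(moving_average_forecast(turnover, 3))
print(exponential_smoothing_forecast(turnover, 0.1))
print(exponential_smoothing_forecast(turnover, 0.9))
```

A smoothing factor of 0.9 tracks the most recent year closely, while 0.1 changes slowly and leans on the older values, so the two settings bracket the range of behavior.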

Rogers investigates the utility of the jackknife for 
establishing confidence intervals on and testing hypotheses 
about the disattenuated correlation coefficient for small
samples. If a person's score is conceived in terms of two
components designated as true score and error score, a
disattenuated correlation expresses a relationship between
the true scores on two different instruments. Following
an extremely thorough review of the literature, the
jackknife procedure is explained and illustrated. Essentially
it is a procedure for obtaining approximate confidence
intervals when standard statistical procedures cannot
be applied.

Using a computer to simulate data, sampling distributions
of disattenuated correlation coefficients were obtained
for different combinations of input parameters.
Characteristics of these distributions with respect to 
central tendency, variability, skewness, and kurtosis are 
described in detail. The best results were obtained from 
sampling distributions which were approximately normally 
distributed and had a large variance. For developing 
confidence intervals when N is small, the jackknife was 
found to be superior to procedures based upon normal 
theory. 
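
The mechanics of the jackknife can be sketched generically. The code below forms leave-one-out pseudo-values for any statistic and returns their mean and standard error; it also states the usual correction for attenuation. It is an illustration under simplifying assumptions, not Rogers' procedure: in particular, it jackknifes an ordinary Pearson correlation on made-up pairs and treats the reliabilities as known constants.

```python
import math
from statistics import mean, variance

def jackknife(data, statistic):
    """Return (estimate, standard_error) from leave-one-out pseudo-values."""
    n = len(data)
    full = statistic(data)
    pseudo = [n * full - (n - 1) * statistic(data[:i] + data[i + 1:])
              for i in range(n)]
    return mean(pseudo), math.sqrt(variance(pseudo) / n)

def pearson_r(pairs):
    """Product-moment correlation for a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def disattenuated_r(r_xy, reliability_x, reliability_y):
    """Correction for attenuation: r_xy / sqrt(r_xx * r_yy)."""
    return r_xy / math.sqrt(reliability_x * reliability_y)

pairs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]   # made-up scores
estimate, standard_error = jackknife(pairs, pearson_r)
print(estimate, standard_error)
print(disattenuated_r(estimate, 0.7, 0.8))          # assumed reliabilities
```

An approximate confidence interval is then the estimate plus or minus a t multiple (n - 1 degrees of freedom) of the standard error.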

Huberty and Blommers compare three indices of
predictor variable potency in order to ascertain the
contribution of each variable toward the discrimination
process over repeated sampling. The indices were: (1)
the scaled weights of the first Fisher-type discriminant
function, and (2) the total and within groups correlation
estimates between each predictor variable and the first
Fisher-type function. After describing the formulation of
multiple group discriminant analysis procedures, the
criterion for assessing stability of predictor variable potency
is discussed in detail. Essentially it involved the observation
of variable rank consistency over repeated replications
of the experiment. Relationships among the ranks
were determined by computing Kendall's coefficient of
concordance. The indices based on correlation estimates
were found to be somewhat more reliable than the one
computed on scaled weights. The rankings were so scattered,
however, that unless the sample size was very large,
none of the indices of variable potency could be relied
upon to give consistent results.
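
Kendall's coefficient of concordance, the stability criterion used here, is easily computed from the rank sums. The sketch below uses the standard no-ties formula and made-up rankings of four predictor variables over three replications.

```python
def kendalls_w(rankings):
    """Kendall's W for m rankings of the same n objects (ranks 1..n, no ties)."""
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Made-up potency ranks of four predictors in three replications.
consistent = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
scattered = [[1, 2, 3, 4], [4, 3, 2, 1], [2, 4, 1, 3]]
print(kendalls_w(consistent), kendalls_w(scattered))
```

W ranges from 0 (no agreement among replications) to 1 (identical rankings), so low values correspond to the scattered rankings the authors report.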



A discussion of interaction analysis and how it relates 
to a one-dependent Markoff chain is presented by Pena. 
The purpose of investigating the relationship between 
these two procedures is to test the order of dependence 
of the interaction chain and to evaluate empirically the 
power of Darwin's criterion, as well as show its relevance
to educational situations. Darwin's Likelihood Ratio
Criterion tests whether two or more matrices composed 
of conditional probabilities are equal. 

Data collected on sixth grade teachers over five subject 
areas is used to test for a one-dependent chain among 
events. Using a Chi-square test of significance it was found
that a two-dependent model provides a better fit to
interaction data than the one-dependent model based
upon Darwin's criterion. A development of the Likelihood
Ratio statistic for a Markoff chain of order two is also
presented, and possible adjustments in Darwin's criteria in
order to reliably analyze data on a one-dependent chain 
are discussed. The sensitivity of the criteria was evaluated 
by observing the power of the test. For the conditions 
imposed, power was found to be very near one. A 
comparison between the empirical distribution of Dar- 
win's criteria and the Chi-square distribution reveals a 
close fit for a sequence length of 500. 



FACTOR ANALYSIS 



The papers in this section represent a wide variety of 
current research under the broad heading of "factor 
analysis." Principal component analysis is utilized and 
discussed frequently, especially in the work of Hakstian.
Topics researched using factor analytic methods include
an assessment of students' ratings of courses and instruc- 
tors, the study of relationships between cognitive abilities 
tests and concept attainment measures, and the compari- 
son of different measures of association. 

Through the use of principal component analysis, 
Magoon and Price researched the factor dimensions
produced from student ratings of course and instructor 
characteristics. They hypothesized that the rated charac- 
teristics reflect the raters' preconceptions of course and 
instructor interrelationships and are not necessarily related 
to actual course characteristics and instructor behavior. 

A very thorough review of the literature is followed by 
an analysis of three sets of rating data obtained from an 
instrument consisting of 22 items. One set of data 
consists of between class ratings, another, within class 
ratings, and a third set was completed prior to the start 
of the course. All sets are submitted to a principal
component analysis, with the first four unrotated factor 
dimensions being compared by means of Tucker's coeffi- 
cient for factor congruence. Interrater reliability is com- 
puted for selected raters and was somewhat low. Results 
show the principal component loadings to be quite similar 
across samples. The authors conclude that the ratings 
reveal more about student preconceptions than the frame- 
work of meaningful instructional quality. 
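
Tucker's coefficient of congruence, used above to compare component loadings across samples, is the sum of cross-products of two loading vectors divided by the square root of the product of their sums of squares. The following is a minimal sketch of that computation; the function name and the loading values are hypothetical illustrations and are not taken from the paper.

```python
import math


def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors."""
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0:
        raise ValueError("loading vectors must not be all zero")
    return numerator / denominator


# Hypothetical loadings of five items on the first component in two samples.
sample_a = [0.80, 0.75, 0.60, 0.55, 0.40]
sample_b = [0.78, 0.70, 0.65, 0.50, 0.45]
print(round(tucker_congruence(sample_a, sample_b), 3))
```

A value near 1 indicates that the two columns of loadings are nearly proportional, which is the sense in which loadings may be called "quite similar across samples."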

Plans for investigating the relationship between some 
cognitive abilities tests and concept attainment measures 
are reported by Harris. The eventual purpose is to 
identify those cognitive abilities that are related to 
concept attainment in four subject matter areas. Three 
approaches to analyzing the relationship between two sets 
of variables are outlined: (1) lumping all variables 
together and factor analyzing; (2) factor analyzing one 
fundamental data set and correlating variables in the other 
set with the factor scores; and (3) employing canonical 
variate analysis and interbattery factor analysis to check 
the stability of factors over different test selections. A 
summary of projected computations is presented at the 
conclusion. 

Keown and Hakstian compare five measures of association 
with respect to stability and robustness of correlation 
and rotated factor matrices for seven point Likert 
scale data. The five measures of association are: (1) 
Pearson's r; (2) tetrachoric r; (3) phi coefficient; (4) phi 
divided by phi max statistic; and (5) Kendall's Tau-B. 
Data conforming to 20 Likert scale variables for five 
different distributions are generated by the computer. The 
distributions chosen for study are referred to as normal, 
rectangular, central, positive skew and mixed skew. For 
each one, five correlation matrices corresponding to the 
five measures of association are generated among the 
Likert scaled variables. All correlation matrices are then 
subjected to a principal component analysis and rotated 
factor matrices are obtained. 

Three measures of robustness are computed from the 
correlation and component pattern matrices. Results from 
the normal distribution are compared with findings from 
the four other distorted distributions. Comparisons be- 
tween the distributions for the two procedures indicate 
that Tau-B followed by Pearson's r are least affected by 
distribution distortion. The effects on each measure of 
association are discussed in detail. Those measures which 
initially require a splitting of the data at the median not 
only create a loss of information, but also produce 
correlational and factorial results which are less than 
optimal. 
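
To make the distinction between the measures concrete, the sketch below computes four of the five coefficients for a single pair of seven point items: Pearson's r, phi after a median split, phi divided by phi max, and Kendall's Tau-B. It is an illustration under stated assumptions only. The item responses are hypothetical, the split rule (scores above the median coded 1) is an assumption, and the tetrachoric r is omitted because it requires an iterative estimate.

```python
import math
from statistics import median


def pearson_r(x, y):
    """Product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def median_split(x):
    """Dichotomize at the median (assumed rule: above the median -> 1)."""
    m = median(x)
    return [1 if a > m else 0 for a in x]


def phi_and_phi_ratio(x, y):
    """Phi on median-split data, and phi divided by phi max.

    The ratio is meaningful for positive phi; degenerate splits
    (all 0 or all 1) are not handled in this sketch.
    """
    bx, by = median_split(x), median_split(y)
    phi = pearson_r(bx, by)  # phi equals Pearson's r on 0/1 data
    px, py = sum(bx) / len(bx), sum(by) / len(by)
    p_hi, p_lo = max(px, py), min(px, py)
    phi_max = math.sqrt(p_lo * (1 - p_hi) / (p_hi * (1 - p_lo)))
    return phi, phi / phi_max


def kendall_tau_b(x, y):
    """Kendall's Tau-B, with the usual correction for ties."""
    concordant = discordant = untied_x = untied_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            untied_x += dx != 0
            untied_y += dy != 0
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    return (concordant - discordant) / math.sqrt(untied_x * untied_y)


# Hypothetical responses of ten raters to two seven point items.
item_1 = [1, 2, 2, 3, 4, 4, 5, 6, 6, 7]
item_2 = [2, 1, 3, 3, 3, 5, 4, 6, 7, 7]
print(pearson_r(item_1, item_2))
print(phi_and_phi_ratio(item_1, item_2))
print(kendall_tau_b(item_1, item_2))
```

The median split reduces each seven point item to two categories before phi is computed, which is the loss of information referred to above.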

To evaluate the degree of goodness of fit of patterns 
derived from a principal component analysis, Skakun, 
Maguire and Hakstian develop an empirical sampling 
distribution of the average trace statistic and use the 
statistic to look for similarities between component 
structures. They discuss several approaches to factor 
congruence and look at the differences that exist between 
two matrices following rotation. 

From a population component score matrix, pairwise 
samples of 50 component scores were randomly selected. 
The average trace was then computed for each pairwise 
sample. The sampling distribution of the average trace was 
found to be positively skewed; consequently, a square 
root transformation was applied to each trace to create an 
approximately normal distribution. Properties of the transformed 
data are reported, and three examples are presented 
to illustrate application of the average trace 
statistic. 
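
The effect of a square root transformation on a positively skewed distribution can be illustrated with a short simulation. The sketch below does not reproduce the average trace statistic, which is not defined in this summary; a gamma distribution is used purely as an assumed stand-in for positively skewed data, and the seed and sample size are arbitrary.

```python
import math
import random


def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Assumed stand-in data: 2000 draws from a positively skewed gamma distribution.
rng = random.Random(1972)
raw = [rng.gammavariate(2.0, 1.0) for _ in range(2000)]
transformed = [math.sqrt(v) for v in raw]

print("skewness before:", skewness(raw))
print("skewness after: ", skewness(transformed))
```

The transformed values remain slightly skewed, so the result is better described as approximately normal than as normal.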

Hakstian develops a number of factor analytic stra- 
tegies for handling longitudinal data collected on the same 
individuals over two different occasions. The five models 
introduced vary with respect to the stability of compo- 
nent scores and factor pattern matrices. Elaborate descrip- 
tions and derivations of each model are presented; four 
are developed by least squares methods, while a fifth relies 
upon canonical correlation procedures. Empirical examples 
are computed for situations where: (1) component 
scores and factor pattern matrices are constant; (2) 
component scores are constant and factor pattern matrices 
are variable; and (3) component scores and factor 
pattern matrices are variable. In the first two examples 
data was simulated on a computer. Correlations are 
computed within and between occasions, between true 
and estimated component scores, and between true and 
estimated rotated pattern matrices. Results indicate a high 
degree of correspondence between true and estimated 
scores. 

Researching further into factor theory, Hakstian and 
Muller discuss ways of determining the significant number 
of factors in a behavioral experiment. They start by 
summarizing the explanatory and taxonomic views of 
factor analysis. A review of three standard factor analysis 
models defined as component, incomplete component, 
and common-factor is then followed by a compilation of 
literature pertaining to the number of potential factors 
over n variables. Major work by Guttman and Kaiser is 
given special attention. 

As a prelude to the experimental study, the authors 
review and compare some of the more commonly used 
rules for determining the appropriate number of factors. 
Data from seventeen correlation matrices appearing in the 
literature are analyzed using eight different rules for 
finding the number of factors. Finding little agreement 
between the various procedures, the authors conclude that 
the appropriate number of factors depends upon the 
factor analytic model and procedures selected, as well as 
the interpretability of the factors. 

Related to the paper by Hakstian and Muller is one by 
Dziuban and Harris in which they empirically evaluate the 
meaningfulness of components in a principal component 
analysis. They point out that extracting components 
where the eigenvalues are greater than one may not 
always produce interpretable results. Bartlett's Test of 
Sphericity is recommended as one safeguard against 
performing an inappropriate principal component analysis, 
but it too is fallible. Lack of statistical significance with 
respect to Bartlett's test implies that principal component 
analysis may be an inappropriate method for analyzing 
data. 
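
For reference, Bartlett's test statistic for a p by p correlation matrix R based on n observations is commonly written as -(n - 1 - (2p + 5)/6) ln|R|, referred to a Chi-square distribution with p(p - 1)/2 degrees of freedom. The sketch below computes the statistic and its degrees of freedom using only the standard library. The function names and the example matrix are hypothetical, and the significance decision is left to a Chi-square table.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi_square, degrees_of_freedom) for Bartlett's test."""
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix must be positive definite")
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    df = p * (p - 1) // 2
    return chi_square, df


# Hypothetical correlation matrix for three variables, n = 100.
corr = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
print(bartlett_sphericity(corr, 100))
```

When the variables are uncorrelated, R is the identity matrix, |R| equals one, and the statistic is zero, which corresponds to the case where a component analysis has nothing to summarize.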

Citing data previously analyzed by principal compo- 
nent analysis after Bartlett's test was found to be 
significant, the authors indicate that a meaningful inter- 
pretation of two of the components is in question. They 
recommend selecting another model and proceed to 
reanalyze the data using image component analysis, 
uniqueness rescaling factor analysis and alpha factor 
analysis. In this illustration, image analysis is found to 
offer protection against interpreting random variables as 
forming the basis of a meaningful component. The paper 
is noteworthy and should be a warning to all potential 
users of principal component analysis. 

Pruzek, Stegman and Pfeiffer discuss a general method 
for analyzing data which has been partitioned in clusters. 
They study the relationship between partitions in order to 
evaluate structural similarities. A measure of the goodness 
of fit of an empirical cluster of items to some theoretical 
cluster is developed, and properties of the proposed test 
statistic, q, are discussed. To illustrate the method, they 
use an example in which 50 students are asked to 
partition 26 items into at least five and not more than 
nine categories. Two different target partitions are selected 
for purposes of analysis: target 1 is an a priori 
splitting of items on the basis of specific item characteristics, 
whereas target 2 is based on results from a latent 
partition analysis. Values of the test statistic tend to be 
smaller for target 2. The authors also outline strategies for 
studying partitions relative to other methods outlined in 
the literature. 

In summary, 30 papers were reviewed under the 
heading of educational statistics. For purposes of convenience 
in reading, they were broken down into five broad 
areas of interest. The two areas receiving the most 
attention were factor analysis and regression theory. The 
quality and rigor of the research presented in 1972 
appeared to be superior to presentations of a year ago. 
Perhaps we are making progress after all. 